Overview

Dataset statistics

Number of variables: 20
Number of observations: 108035
Missing cells: 422997
Missing cells (%): 19.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 17.3 MiB
Average record size in memory: 168.0 B
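
The percentage and per-record figures in the dataset statistics follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell share is taken over all cells (rows times columns) and that 1 MiB is 2**20 bytes; the variable names are illustrative and do not come from the report:

```python
# Figures copied from the dataset statistics.
n_variables = 20
n_observations = 108035
missing_cells = 422997
total_size_mib = 17.3  # rounded value as reported

# Share of missing cells over the whole rows x columns grid.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate bytes per record, derived from the rounded total size.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"{missing_pct:.1f}%")
print(f"{avg_record_bytes:.0f} B")
```

Because the total size is itself rounded to one decimal, the per-record value only agrees with the reported 168.0 B after rounding.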

Variable types

Categorical: 7
Numeric: 13

Alerts

StationId has a high cardinality: 110 distinct values [High cardinality]
Date has a high cardinality: 2009 distinct values [High cardinality]
StationName has a high cardinality: 110 distinct values [High cardinality]
PM2.5 is highly correlated with PM10 and 4 other fields [High correlation]
PM10 is highly correlated with PM2.5 and 6 other fields [High correlation]
NO is highly correlated with PM2.5 and 4 other fields [High correlation]
NO2 is highly correlated with PM2.5 and 4 other fields [High correlation]
NOx is highly correlated with PM10 and 3 other fields [High correlation]
NH3 is highly correlated with PM2.5 and 2 other fields [High correlation]
CO is highly correlated with PM10 and 1 other field [High correlation]
Benzene is highly correlated with Toluene and 1 other field [High correlation]
Toluene is highly correlated with Benzene and 1 other field [High correlation]
Xylene is highly correlated with Benzene and 1 other field [High correlation]
AQI is highly correlated with PM2.5 and 6 other fields [High correlation]
PM2.5 is highly correlated with PM10 and 1 other field [High correlation]
PM10 is highly correlated with PM2.5 and 3 other fields [High correlation]
NO is highly correlated with PM10 and 2 other fields [High correlation]
NO2 is highly correlated with NO and 1 other field [High correlation]
NOx is highly correlated with PM10 and 2 other fields [High correlation]
Toluene is highly correlated with Xylene [High correlation]
Xylene is highly correlated with Toluene [High correlation]
AQI is highly correlated with PM2.5 and 1 other field [High correlation]
PM2.5 is highly correlated with PM10 and 1 other field [High correlation]
PM10 is highly correlated with PM2.5 and 1 other field [High correlation]
NO is highly correlated with NOx [High correlation]
NO2 is highly correlated with NOx [High correlation]
NOx is highly correlated with NO and 1 other field [High correlation]
Benzene is highly correlated with Toluene and 1 other field [High correlation]
Toluene is highly correlated with Benzene and 1 other field [High correlation]
Xylene is highly correlated with Benzene and 1 other field [High correlation]
AQI is highly correlated with PM2.5 and 1 other field [High correlation]
State is highly correlated with City [High correlation]
City is highly correlated with State [High correlation]
PM2.5 is highly correlated with PM10 and 2 other fields [High correlation]
PM10 is highly correlated with PM2.5 and 4 other fields [High correlation]
NO is highly correlated with NO2 and 1 other field [High correlation]
NO2 is highly correlated with NO and 1 other field [High correlation]
NOx is highly correlated with PM10 and 2 other fields [High correlation]
CO is highly correlated with AQI [High correlation]
SO2 is highly correlated with City [High correlation]
Benzene is highly correlated with Toluene [High correlation]
Toluene is highly correlated with Benzene [High correlation]
AQI is highly correlated with PM2.5 and 3 other fields [High correlation]
AQI_Bucket is highly correlated with PM2.5 and 4 other fields [High correlation]
City is highly correlated with SO2 and 2 other fields [High correlation]
State is highly correlated with PM10 and 2 other fields [High correlation]
PM2.5 has 21625 (20.0%) missing values [Missing]
PM10 has 42706 (39.5%) missing values [Missing]
NO has 17106 (15.8%) missing values [Missing]
NO2 has 16547 (15.3%) missing values [Missing]
NOx has 15500 (14.3%) missing values [Missing]
NH3 has 48105 (44.5%) missing values [Missing]
CO has 12998 (12.0%) missing values [Missing]
SO2 has 25204 (23.3%) missing values [Missing]
O3 has 25568 (23.7%) missing values [Missing]
Benzene has 31455 (29.1%) missing values [Missing]
Toluene has 38702 (35.8%) missing values [Missing]
Xylene has 85137 (78.8%) missing values [Missing]
AQI has 21010 (19.4%) missing values [Missing]
AQI_Bucket has 21010 (19.4%) missing values [Missing]
Benzene is highly skewed (γ1 = 21.61702016) [Skewed]
NOx has 4776 (4.4%) zeros [Zeros]
CO has 7280 (6.7%) zeros [Zeros]
Benzene has 12602 (11.7%) zeros [Zeros]
Toluene has 10455 (9.7%) zeros [Zeros]
Xylene has 6083 (5.6%) zeros [Zeros]
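
The skewness alert reports a coefficient γ1. As a rough illustration of what such a coefficient measures, the sketch below computes the population moment coefficient g1 = m3 / m2**1.5 on toy data. This is an assumption for illustration: the report may use a bias-corrected estimator, and the function name and sample values are hypothetical.

```python
def skewness(values):
    """Population moment coefficient of skewness, g1 = m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]      # no tail on either side
right_tailed = [0.0, 0.0, 0.1, 0.2, 50.0]  # hypothetical data with one large value

print(skewness(symmetric))
print(skewness(right_tailed))
```

A symmetric sample gives a coefficient of zero, while a single large value pulls it strongly positive, which is the pattern a variable with many zeros and a long right tail would show.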

Reproduction

Analysis started: 2022-01-06 06:53:36.556489
Analysis finished: 2022-01-06 06:54:12.270600
Duration: 35.71 seconds
Software version: pandas-profiling v3.1.1
Download configuration: config.json

Variables

StationId
Categorical

HIGH CARDINALITY

Distinct: 110
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB

KA009               2009
DL008               2009
DL033               2009
KA003               2009
TN003               2009
Other values (105)  97990

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 540175
Distinct characters: 29
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
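
To make the character-property idea concrete, here is a small sketch that counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the three IDs are taken from values that appear in this report.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

sample_ids = ["AP001", "KA009", "DL008"]  # station IDs seen in this report
counts = category_counts(sample_ids)

# "Lu" is Uppercase Letter, "Nd" is Decimal Number.
print(counts["Lu"], counts["Nd"])
```

Each sample ID has two uppercase letters and three digits, which matches the 40% / 60% split between Uppercase Letter and Decimal Number reported for this variable.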

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AP001
2nd row: AP001
3rd row: AP001
4th row: AP001
5th row: AP001

Common Values

Value               Count   Frequency (%)
KA009               2009    1.9%
DL008               2009    1.9%
DL033               2009    1.9%
KA003               2009    1.9%
TN003               2009    1.9%
GJ001               2009    1.9%
DL007               2009    1.9%
DL021               2009    1.9%
TN001               2009    1.9%
UP012               2009    1.9%
Other values (100)  87945   81.4%

Length

Histogram of lengths of the category
Value               Count   Frequency (%)
ka009               2009    1.9%
tn001               2009    1.9%
up014               2009    1.9%
tn004               2009    1.9%
dl013               2009    1.9%
mh005               2009    1.9%
up012               2009    1.9%
dl008               2009    1.9%
dl021               2009    1.9%
dl007               2009    1.9%
Other values (100)  87945   81.4%

Most occurring characters

Value              Count    Frequency (%)
0                  171293   31.7%
1                  48475    9.0%
L                  47246    8.7%
D                  47223    8.7%
3                  23227    4.3%
2                  21884    4.1%
T                  15544    2.9%
A                  14911    2.8%
4                  14632    2.7%
K                  13572    2.5%
Other values (19)  122168   22.6%

Most occurring categories

Value             Count    Frequency (%)
Decimal Number    324105   60.0%
Uppercase Letter  216070   40.0%

Most frequent character per category

Uppercase Letter
Value             Count   Frequency (%)
L                 47246   21.9%
D                 47223   21.9%
T                 15544   7.2%
A                 14911   6.9%
K                 13572   6.3%
G                 10761   5.0%
P                 10022   4.6%
H                 9808    4.5%
R                 8598    4.0%
B                 7064    3.3%
Other values (9)  31321   14.5%

Decimal Number
Value  Count    Frequency (%)
0      171293   52.9%
1      48475    15.0%
3      23227    7.2%
2      21884    6.8%
4      14632    4.5%
5      13083    4.0%
8      8417     2.6%
6      8350     2.6%
7      8231     2.5%
9      6513     2.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  324105   60.0%
Latin   216070   40.0%

Most frequent character per script

Latin
Value             Count   Frequency (%)
L                 47246   21.9%
D                 47223   21.9%
T                 15544   7.2%
A                 14911   6.9%
K                 13572   6.3%
G                 10761   5.0%
P                 10022   4.6%
H                 9808    4.5%
R                 8598    4.0%
B                 7064    3.3%
Other values (9)  31321   14.5%

Common
Value  Count    Frequency (%)
0      171293   52.9%
1      48475    15.0%
3      23227    7.2%
2      21884    6.8%
4      14632    4.5%
5      13083    4.0%
8      8417     2.6%
6      8350     2.6%
7      8231     2.5%
9      6513     2.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  540175   100.0%

Most frequent character per block

ASCII
Value              Count    Frequency (%)
0                  171293   31.7%
1                  48475    9.0%
L                  47246    8.7%
D                  47223    8.7%
3                  23227    4.3%
2                  21884    4.1%
T                  15544    2.9%
A                  14911    2.8%
4                  14632    2.7%
K                  13572    2.5%
Other values (19)  122168   22.6%

Date
Categorical

HIGH CARDINALITY

Distinct: 2009
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB

2020-03-29           108
2020-04-21           108
2020-05-17           108
2020-06-13           108
2020-04-26           108
Other values (2004)  107495

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 1080350
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017-11-24
2nd row: 2017-11-25
3rd row: 2017-11-26
4th row: 2017-11-27
5th row: 2017-11-28

Common Values

Value                Count    Frequency (%)
2020-03-29           108      0.1%
2020-04-21           108      0.1%
2020-05-17           108      0.1%
2020-06-13           108      0.1%
2020-04-26           108      0.1%
2020-05-03           108      0.1%
2020-04-16           108      0.1%
2020-05-26           108      0.1%
2020-04-15           108      0.1%
2020-04-17           108      0.1%
Other values (1999)  106955   99.0%

Length

Histogram of lengths of the category
Value                Count    Frequency (%)
2020-03-29           108      0.1%
2020-05-13           108      0.1%
2020-07-01           108      0.1%
2020-05-10           108      0.1%
2020-05-09           108      0.1%
2020-05-05           108      0.1%
2020-03-22           108      0.1%
2020-03-20           108      0.1%
2020-06-29           108      0.1%
2020-05-21           108      0.1%
Other values (1999)  106955   99.0%

Most occurring characters

ValueCountFrequency (%)
0260693
24.1%
-216070
20.0%
2191188
17.7%
1179627
16.6%
950830
 
4.7%
844185
 
4.1%
732023
 
3.0%
630810
 
2.9%
528358
 
2.6%
325989
 
2.4%

Most occurring categories

Value             Count    Frequency (%)
Decimal Number    864280   80.0%
Dash Punctuation  216070   20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0260693
30.2%
2191188
22.1%
1179627
20.8%
950830
 
5.9%
844185
 
5.1%
732023
 
3.7%
630810
 
3.6%
528358
 
3.3%
325989
 
3.0%
420577
 
2.4%
Dash Punctuation
ValueCountFrequency (%)
-216070
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1080350
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0260693
24.1%
-216070
20.0%
2191188
17.7%
1179627
16.6%
950830
 
4.7%
844185
 
4.1%
732023
 
3.0%
630810
 
2.9%
528358
 
2.6%
325989
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII1080350
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0260693
24.1%
-216070
20.0%
2191188
17.7%
1179627
16.6%
950830
 
4.7%
844185
 
4.1%
732023
 
3.0%
630810
 
2.9%
528358
 
2.6%
325989
 
2.4%

PM2.5
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 22395
Distinct (%): 25.9%
Missing: 21625
Missing (%): 20.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 80.27257135
Minimum: 0.02
Maximum: 1000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB
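
The two percentages in this summary appear to use different denominators. The sketch below checks this against the PM2.5 figures, assuming that Missing (%) is taken over all observations and Distinct (%) over non-missing observations only; that reading is inferred from the numbers and is not stated in the report.

```python
n_rows = 108035    # number of observations in the dataset
missing = 21625    # missing PM2.5 values
distinct = 22395   # distinct PM2.5 values

non_missing = n_rows - missing
missing_pct = 100 * missing / n_rows          # share of all rows
distinct_pct = 100 * distinct / non_missing   # share of non-missing rows

print(f"{missing_pct:.1f}%  {distinct_pct:.1f}%")
```

Dividing the distinct count by all 108035 rows instead would give about 20.7%, which does not match the reported 25.9%.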

Quantile statistics

Minimum: 0.02
5-th percentile: 13.02
Q1: 31.88
median: 55.95
Q3: 99.92
95-th percentile: 236.5655
Maximum: 1000
Range: 999.98
Interquartile range (IQR): 68.04
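
Range and IQR are derived from the other quantile figures. The sketch below reproduces them from the reported PM2.5 values and then computes quartiles on a small toy sample with the standard library. The toy data and the choice of the "inclusive" (linear interpolation) method are assumptions for illustration, not a statement of how the report computes its quantiles.

```python
from statistics import quantiles

# Reported PM2.5 quantile statistics.
minimum, maximum = 0.02, 1000.0
q1, q3 = 31.88, 99.92

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range

# Quartiles of a hypothetical toy sample, using linear interpolation.
toy = [1, 2, 3, 4, 5]
toy_q1, toy_median, toy_q3 = quantiles(toy, n=4, method="inclusive")

print(round(value_range, 2), round(iqr, 2))
print(toy_q1, toy_median, toy_q3)
```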

Descriptive statistics

Standard deviation: 76.52640254
Coefficient of variation (CV): 0.9533318948
Kurtosis: 10.62438324
Mean: 80.27257135
Median Absolute Deviation (MAD): 29.07
Skewness: 2.563923541
Sum: 6936352.89
Variance: 5856.290285
Monotonicity: Not monotonic
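
Several of these descriptive statistics are tied together by simple identities. A minimal consistency check on the reported PM2.5 figures, assuming the mean is taken over non-missing values (108035 observations minus 21625 missing); the variable names are illustrative:

```python
# Reported PM2.5 descriptive statistics.
mean = 80.27257135
std = 76.52640254
total = 6936352.89
n_non_missing = 108035 - 21625  # observations minus missing values

cv = std / mean                        # coefficient of variation
variance = std ** 2                    # variance is the squared standard deviation
mean_from_sum = total / n_non_missing  # mean recomputed from the sum

print(round(cv, 6), round(variance, 3), round(mean_from_sum, 5))
```

All three derived values agree with the reported CV, variance, and mean to within the rounding of the inputs.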
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
11                    27      < 0.1%
31.08                 23      < 0.1%
28.8                  22      < 0.1%
24.83                 22      < 0.1%
21.38                 21      < 0.1%
42.5                  21      < 0.1%
34.1                  21      < 0.1%
29.75                 21      < 0.1%
15                    20      < 0.1%
21                    20      < 0.1%
Other values (22385)  86192   79.8%
(Missing)             21625   20.0%
Value  Count  Frequency (%)
0.02   2      < 0.1%
0.04   1      < 0.1%
0.15   1      < 0.1%
0.16   1      < 0.1%
0.19   1      < 0.1%
0.2    1      < 0.1%
0.24   1      < 0.1%
0.25   1      < 0.1%
0.28   1      < 0.1%
0.32   1      < 0.1%
Value   Count  Frequency (%)
1000    1      < 0.1%
999.99  1      < 0.1%
995     1      < 0.1%
992.67  1      < 0.1%
949.99  1      < 0.1%
917.77  1      < 0.1%
916.67  1      < 0.1%
914.94  1      < 0.1%
914.64  1      < 0.1%
894.75  1      < 0.1%

PM10
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 29575
Distinct (%): 45.3%
Missing: 42706
Missing (%): 39.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 157.9684272
Minimum: 0.01
Maximum: 1000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0.01
5-th percentile: 30.27
Q1: 70.15
median: 122.09
Q3: 208.67
95-th percentile: 409.386
Maximum: 1000
Range: 999.99
Interquartile range (IQR): 138.52

Descriptive statistics

Standard deviation: 123.4186718
Coefficient of variation (CV): 0.7812869569
Kurtosis: 3.383081112
Mean: 157.9684272
Median Absolute Deviation (MAD): 61.83
Skewness: 1.644048099
Sum: 10319919.38
Variance: 15232.16854
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
9412
 
< 0.1%
71.8812
 
< 0.1%
71.0512
 
< 0.1%
108.4611
 
< 0.1%
55.6611
 
< 0.1%
56.611
 
< 0.1%
41.6210
 
< 0.1%
3110
 
< 0.1%
46.6810
 
< 0.1%
65.8810
 
< 0.1%
Other values (29565)65220
60.4%
(Missing)42706
39.5%
ValueCountFrequency (%)
0.011
 
< 0.1%
0.021
 
< 0.1%
0.031
 
< 0.1%
0.042
< 0.1%
0.061
 
< 0.1%
0.071
 
< 0.1%
0.081
 
< 0.1%
0.092
< 0.1%
0.13
< 0.1%
0.123
< 0.1%
ValueCountFrequency (%)
10001
< 0.1%
9852
< 0.1%
976.771
< 0.1%
960.981
< 0.1%
955.61
< 0.1%
9521
< 0.1%
9421
< 0.1%
938.51
< 0.1%
936.251
< 0.1%
933.051
< 0.1%

NO
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 11963
Distinct (%): 13.2%
Missing: 17106
Missing (%): 15.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.12342399
Minimum: 0.01
Maximum: 470
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0.01
5-th percentile: 1.5
Q1: 4.84
median: 10.29
Q3: 24.98
95-th percentile: 93.766
Maximum: 470
Range: 469.99
Interquartile range (IQR): 20.14

Descriptive statistics

Standard deviation: 34.49101855
Coefficient of variation (CV): 1.49160516
Kurtosis: 14.05293685
Mean: 23.12342399
Median Absolute Deviation (MAD): 7.02
Skewness: 3.288711003
Sum: 2102589.82
Variance: 1189.63036
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
2.581
 
0.1%
2.8979
 
0.1%
2.9379
 
0.1%
0.7377
 
0.1%
2.4977
 
0.1%
376
 
0.1%
2.8776
 
0.1%
2.8475
 
0.1%
2.9575
 
0.1%
3.9975
 
0.1%
Other values (11953)90159
83.5%
(Missing)17106
 
15.8%
ValueCountFrequency (%)
0.011
 
< 0.1%
0.0210
< 0.1%
0.034
 
< 0.1%
0.042
 
< 0.1%
0.063
 
< 0.1%
0.071
 
< 0.1%
0.083
 
< 0.1%
0.092
 
< 0.1%
0.14
 
< 0.1%
0.114
 
< 0.1%
ValueCountFrequency (%)
4701
< 0.1%
437.851
< 0.1%
436.81
< 0.1%
429.771
< 0.1%
403.941
< 0.1%
390.681
< 0.1%
383.141
< 0.1%
382.441
< 0.1%
374.711
< 0.1%
373.91
< 0.1%

NO2
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 12050
Distinct (%): 13.2%
Missing: 16547
Missing (%): 15.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.2407601
Minimum: 0.01
Maximum: 448.05
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0.01
5-th percentile: 5.3935
Q1: 15.09
median: 27.21
Q3: 46.93
95-th percentile: 89.88
Maximum: 448.05
Range: 448.04
Interquartile range (IQR): 31.84

Descriptive statistics

Standard deviation: 29.51082713
Coefficient of variation (CV): 0.8374060901
Kurtosis: 11.06061697
Mean: 35.2407601
Median Absolute Deviation (MAD): 14.41
Skewness: 2.359287053
Sum: 3224106.66
Variance: 870.8889177
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
943
 
< 0.1%
17.5839
 
< 0.1%
2037
 
< 0.1%
16.0737
 
< 0.1%
17.8236
 
< 0.1%
9.4736
 
< 0.1%
0.236
 
< 0.1%
9.1436
 
< 0.1%
9.2235
 
< 0.1%
13.635
 
< 0.1%
Other values (12040)91118
84.3%
(Missing)16547
 
15.3%
ValueCountFrequency (%)
0.014
< 0.1%
0.027
< 0.1%
0.039
< 0.1%
0.043
 
< 0.1%
0.053
 
< 0.1%
0.063
 
< 0.1%
0.077
< 0.1%
0.086
< 0.1%
0.097
< 0.1%
0.18
< 0.1%
ValueCountFrequency (%)
448.051
< 0.1%
397.771
< 0.1%
397.311
< 0.1%
394.041
< 0.1%
393.081
< 0.1%
369.031
< 0.1%
363.751
< 0.1%
362.731
< 0.1%
362.51
< 0.1%
362.211
< 0.1%

NOx
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 15608
Distinct (%): 16.9%
Missing: 15500
Missing (%): 14.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 41.19505538
Minimum: 0
Maximum: 467.63
Zeros: 4776
Zeros (%): 4.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 13.97
median: 26.66
Q3: 50.5
95-th percentile: 134.153
Maximum: 467.63
Range: 467.63
Interquartile range (IQR): 36.53

Descriptive statistics

Standard deviation: 45.1459756
Coefficient of variation (CV): 1.095907632
Kurtosis: 8.454914527
Mean: 41.19505538
Median Absolute Deviation (MAD): 15.84
Skewness: 2.53978547
Sum: 3811984.45
Variance: 2038.159113
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
04776
 
4.4%
6.24601
 
0.6%
2.21516
 
0.5%
9.0537
 
< 0.1%
15.7534
 
< 0.1%
16.5634
 
< 0.1%
11.0234
 
< 0.1%
22.0234
 
< 0.1%
17.0134
 
< 0.1%
15.0933
 
< 0.1%
Other values (15598)86402
80.0%
(Missing)15500
 
14.3%
ValueCountFrequency (%)
04776
4.4%
0.017
 
< 0.1%
0.023
 
< 0.1%
0.0314
 
< 0.1%
0.0413
 
< 0.1%
0.054
 
< 0.1%
0.062
 
< 0.1%
0.073
 
< 0.1%
0.082
 
< 0.1%
0.092
 
< 0.1%
ValueCountFrequency (%)
467.631
< 0.1%
453.611
< 0.1%
442.691
< 0.1%
440.311
< 0.1%
434.91
< 0.1%
429.381
< 0.1%
402.271
< 0.1%
399.871
< 0.1%
395.861
< 0.1%
395.331
< 0.1%

NH3
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 9119
Distinct (%): 15.2%
Missing: 48105
Missing (%): 44.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.73287519
Minimum: 0.01
Maximum: 418.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0.01
5-th percentile: 3.85
Q1: 11.9
median: 23.59
Q3: 38.1375
95-th percentile: 70.23
Maximum: 418.9
Range: 418.89
Interquartile range (IQR): 26.2375

Descriptive statistics

Standard deviation: 24.89779732
Coefficient of variation (CV): 0.8665264843
Kurtosis: 22.77179802
Mean: 28.73287519
Median Absolute Deviation (MAD): 12.6
Skewness: 3.218919233
Sum: 1721961.21
Variance: 619.9003114
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
13.4249
 
< 0.1%
6.2942
 
< 0.1%
14.6239
 
< 0.1%
6.338
 
< 0.1%
6.2838
 
< 0.1%
6.3136
 
< 0.1%
6.634
 
< 0.1%
10.4233
 
< 0.1%
6.2532
 
< 0.1%
6.3232
 
< 0.1%
Other values (9109)59557
55.1%
(Missing)48105
44.5%
ValueCountFrequency (%)
0.014
 
< 0.1%
0.029
 
< 0.1%
0.031
 
< 0.1%
0.042
 
< 0.1%
0.052
 
< 0.1%
0.066
 
< 0.1%
0.071
 
< 0.1%
0.082
 
< 0.1%
0.091
 
< 0.1%
0.123
< 0.1%
ValueCountFrequency (%)
418.91
< 0.1%
408.581
< 0.1%
379.321
< 0.1%
371.361
< 0.1%
365.681
< 0.1%
361.751
< 0.1%
356.731
< 0.1%
352.891
< 0.1%
349.251
< 0.1%
335.911
< 0.1%

CO
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 2352
Distinct (%): 2.5%
Missing: 12998
Missing (%): 12.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.60574934
Minimum: 0
Maximum: 175.81
Zeros: 7280
Zeros (%): 6.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.53
median: 0.91
Q3: 1.45
95-th percentile: 3.38
Maximum: 175.81
Range: 175.81
Interquartile range (IQR): 0.92

Descriptive statistics

Standard deviation: 4.369577753
Coefficient of variation (CV): 2.721207878
Kurtosis: 224.1688928
Mean: 1.60574934
Median Absolute Deviation (MAD): 0.43
Skewness: 12.19795088
Sum: 152605.6
Variance: 19.09320974
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
07280
 
6.7%
0.64713
 
0.7%
0.66696
 
0.6%
0.7688
 
0.6%
0.78679
 
0.6%
0.74673
 
0.6%
0.68671
 
0.6%
0.6662
 
0.6%
0.79659
 
0.6%
0.76656
 
0.6%
Other values (2342)81660
75.6%
(Missing)12998
 
12.0%
ValueCountFrequency (%)
07280
6.7%
0.0196
 
0.1%
0.02125
 
0.1%
0.0354
 
< 0.1%
0.0452
 
< 0.1%
0.0573
 
0.1%
0.0654
 
< 0.1%
0.0745
 
< 0.1%
0.0854
 
< 0.1%
0.0959
 
0.1%
ValueCountFrequency (%)
175.811
< 0.1%
145.321
< 0.1%
134.851
< 0.1%
132.471
< 0.1%
132.071
< 0.1%
124.011
< 0.1%
119.681
< 0.1%
119.31
< 0.1%
118.021
< 0.1%
1181
< 0.1%

SO2
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 5801
Distinct (%): 7.0%
Missing: 25204
Missing (%): 23.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.2576341
Minimum: 0.01
Maximum: 195.65
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0.01
5-th percentile: 2.07
Q1: 5.04
median: 8.95
Q3: 14.92
95-th percentile: 32.54
Maximum: 195.65
Range: 195.64
Interquartile range (IQR): 9.88

Descriptive statistics

Standard deviation: 12.98472338
Coefficient of variation (CV): 1.059317261
Kurtosis: 34.72063189
Mean: 12.2576341
Median Absolute Deviation (MAD): 4.54
Skewness: 4.581281421
Sum: 1015312.09
Variance: 168.6030412
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
3.484
 
0.1%
3.3880
 
0.1%
479
 
0.1%
5.8678
 
0.1%
3.2875
 
0.1%
3.5175
 
0.1%
3.4875
 
0.1%
6.4274
 
0.1%
5.874
 
0.1%
6.9974
 
0.1%
Other values (5791)82063
76.0%
(Missing)25204
 
23.3%
ValueCountFrequency (%)
0.012
 
< 0.1%
0.021
 
< 0.1%
0.033
 
< 0.1%
0.046
< 0.1%
0.053
 
< 0.1%
0.068
< 0.1%
0.073
 
< 0.1%
0.0810
< 0.1%
0.094
 
< 0.1%
0.16
< 0.1%
ValueCountFrequency (%)
195.651
< 0.1%
193.861
< 0.1%
187.021
< 0.1%
186.081
< 0.1%
182.391
< 0.1%
180.851
< 0.1%
179.181
< 0.1%
178.931
< 0.1%
178.631
< 0.1%
178.581
< 0.1%

O3
Real number (ℝ≥0)

MISSING

Distinct: 11166
Distinct (%): 13.5%
Missing: 25568
Missing (%): 23.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.13483551
Minimum: 0.01
Maximum: 963
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0.01
5-th percentile: 7.02
Q1: 18.895
median: 30.84
Q3: 47.14
95-th percentile: 81
Maximum: 963
Range: 962.99
Interquartile range (IQR): 28.245

Descriptive statistics

Standard deviation: 39.12800382
Coefficient of variation (CV): 1.026043598
Kurtosis: 75.83136294
Mean: 38.13483551
Median Absolute Deviation (MAD): 13.49
Skewness: 6.844483445
Sum: 3144865.48
Variance: 1531.000683
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
16.4835
 
< 0.1%
22.9434
 
< 0.1%
23.633
 
< 0.1%
22.533
 
< 0.1%
34.432
 
< 0.1%
25.6431
 
< 0.1%
19.1231
 
< 0.1%
23.5931
 
< 0.1%
21.430
 
< 0.1%
21.2829
 
< 0.1%
Other values (11156)82148
76.0%
(Missing)25568
 
23.7%
ValueCountFrequency (%)
0.014
 
< 0.1%
0.029
< 0.1%
0.034
 
< 0.1%
0.043
 
< 0.1%
0.054
 
< 0.1%
0.063
 
< 0.1%
0.073
 
< 0.1%
0.082
 
< 0.1%
0.093
 
< 0.1%
0.111
< 0.1%
ValueCountFrequency (%)
9631
< 0.1%
868.21
< 0.1%
819.061
< 0.1%
777.671
< 0.1%
763.121
< 0.1%
7571
< 0.1%
714.751
< 0.1%
705.321
< 0.1%
698.671
< 0.1%
694.961
< 0.1%

Benzene
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct: 3017
Distinct (%): 3.9%
Missing: 31455
Missing (%): 29.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.35802925
Minimum: 0
Maximum: 455.03
Zeros: 12602
Zeros (%): 11.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.16
median: 1.21
Q3: 3.61
95-th percentile: 11.35
Maximum: 455.03
Range: 455.03
Interquartile range (IQR): 3.45

Descriptive statistics

Standard deviation: 11.15623448
Coefficient of variation (CV): 3.322256492
Kurtosis: 695.2028255
Mean: 3.35802925
Median Absolute Deviation (MAD): 1.21
Skewness: 21.61702016
Sum: 257157.88
Variance: 124.4615677
Monotonicity: Not monotonic
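
For Benzene the MAD (1.21) is far smaller than the standard deviation (11.16), which is typical of a heavily right-skewed variable. The sketch below shows one common definition of the median absolute deviation on toy data. It assumes the unscaled form, median(|x - median(x)|); whether the report applies any scaling factor is not stated, and the sample values are hypothetical.

```python
from statistics import median, pstdev

def median_absolute_deviation(values):
    """Unscaled MAD: the median of absolute deviations from the median."""
    center = median(values)
    return median(abs(v - center) for v in values)

# Hypothetical sample with one extreme value, loosely mimicking a long right tail.
toy = [0.0, 0.0, 0.5, 1.0, 1.5, 4.0, 40.0]

mad = median_absolute_deviation(toy)
spread = pstdev(toy)  # population standard deviation, for comparison

print(mad, round(spread, 2))
```

The single extreme value inflates the standard deviation, while the MAD stays close to the spread of the bulk of the data.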
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
012602
 
11.7%
0.1689
 
0.6%
0.01541
 
0.5%
0.02474
 
0.4%
0.12439
 
0.4%
0.11422
 
0.4%
0.03420
 
0.4%
0.05409
 
0.4%
0.04403
 
0.4%
0.13398
 
0.4%
Other values (3007)59783
55.3%
(Missing)31455
29.1%
ValueCountFrequency (%)
012602
11.7%
0.01541
 
0.5%
0.02474
 
0.4%
0.03420
 
0.4%
0.04403
 
0.4%
0.05409
 
0.4%
0.06374
 
0.3%
0.07321
 
0.3%
0.08337
 
0.3%
0.09337
 
0.3%
ValueCountFrequency (%)
455.031
< 0.1%
454.851
< 0.1%
449.381
< 0.1%
448.591
< 0.1%
445.831
< 0.1%
443.631
< 0.1%
438.011
< 0.1%
435.91
< 0.1%
435.091
< 0.1%
432.941
< 0.1%

Toluene
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 8713
Distinct (%): 12.6%
Missing: 38702
Missing (%): 35.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.34539426
Minimum: 0
Maximum: 454.85
Zeros: 10455
Zeros (%): 9.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.69
median: 4.33
Q3: 17.51
95-th percentile: 64.934
Maximum: 454.85
Range: 454.85
Interquartile range (IQR): 16.82

Descriptive statistics

Standard deviation: 29.3485873
Coefficient of variation (CV): 1.912533938
Kurtosis: 33.33721267
Mean: 15.34539426
Median Absolute Deviation (MAD): 4.33
Skewness: 4.598334818
Sum: 1063942.22
Variance: 861.3395766
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
010455
 
9.7%
0.01268
 
0.2%
0.02199
 
0.2%
0.07198
 
0.2%
0.03179
 
0.2%
0.04173
 
0.2%
0.1173
 
0.2%
0.08167
 
0.2%
0.06161
 
0.1%
0.09153
 
0.1%
Other values (8703)57207
53.0%
(Missing)38702
35.8%
ValueCountFrequency (%)
010455
9.7%
0.01268
 
0.2%
0.02199
 
0.2%
0.03179
 
0.2%
0.04173
 
0.2%
0.05147
 
0.1%
0.06161
 
0.1%
0.07198
 
0.2%
0.08167
 
0.2%
0.09153
 
0.1%
ValueCountFrequency (%)
454.851
< 0.1%
454.121
< 0.1%
449.141
< 0.1%
448.871
< 0.1%
445.841
< 0.1%
443.631
< 0.1%
437.771
< 0.1%
435.941
< 0.1%
434.921
< 0.1%
433.021
< 0.1%

Xylene
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct: 1892
Distinct (%): 8.3%
Missing: 85137
Missing (%): 78.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.423446153
Minimum: 0
Maximum: 170.37
Zeros: 6083
Zeros (%): 5.6%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.4
Q3: 2.11
95-th percentile: 10.7315
Maximum: 170.37
Range: 170.37
Interquartile range (IQR): 2.11

Descriptive statistics

Standard deviation: 6.472408501
Coefficient of variation (CV): 2.670745745
Kurtosis: 119.5691605
Mean: 2.423446153
Median Absolute Deviation (MAD): 0.4
Skewness: 8.629738992
Sum: 55492.07
Variance: 41.8920718
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
06083
 
5.6%
0.01500
 
0.5%
0.02423
 
0.4%
0.1327
 
0.3%
0.03297
 
0.3%
0.04241
 
0.2%
0.05188
 
0.2%
0.12183
 
0.2%
0.06180
 
0.2%
0.11168
 
0.2%
Other values (1882)14308
 
13.2%
(Missing)85137
78.8%
ValueCountFrequency (%)
06083
5.6%
0.01500
 
0.5%
0.02423
 
0.4%
0.03297
 
0.3%
0.04241
 
0.2%
0.05188
 
0.2%
0.06180
 
0.2%
0.07151
 
0.1%
0.08148
 
0.1%
0.09135
 
0.1%
ValueCountFrequency (%)
170.371
< 0.1%
137.451
< 0.1%
133.61
< 0.1%
132.971
< 0.1%
129.281
< 0.1%
125.181
< 0.1%
123.291
< 0.1%
116.621
< 0.1%
109.981
< 0.1%
109.231
< 0.1%

AQI
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 930
Distinct (%): 1.1%
Missing: 21010
Missing (%): 19.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 179.7492904
Minimum: 8
Maximum: 2049
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 MiB

Quantile statistics

Minimum: 8
5-th percentile: 47
Q1: 86
median: 132
Q3: 254
95-th percentile: 415
Maximum: 2049
Range: 2041
Interquartile range (IQR): 168

Descriptive statistics

Standard deviation: 131.3243389
Coefficient of variation (CV): 0.7305972588
Kurtosis: 8.532544269
Mean: 179.7492904
Median Absolute Deviation (MAD): 62
Skewness: 1.930087687
Sum: 15642682
Variance: 17246.08198
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
104615
 
0.6%
102593
 
0.5%
106587
 
0.5%
108561
 
0.5%
100560
 
0.5%
88549
 
0.5%
98547
 
0.5%
90546
 
0.5%
92546
 
0.5%
78545
 
0.5%
Other values (920)81376
75.3%
(Missing)21010
 
19.4%
ValueCountFrequency (%)
81
 
< 0.1%
102
 
< 0.1%
132
 
< 0.1%
147
 
< 0.1%
155
 
< 0.1%
1610
 
< 0.1%
1713
 
< 0.1%
1811
 
< 0.1%
1936
< 0.1%
2049
< 0.1%
ValueCountFrequency (%)
20491
< 0.1%
19171
< 0.1%
18421
< 0.1%
17471
< 0.1%
17191
< 0.1%
16721
< 0.1%
16461
< 0.1%
16301
< 0.1%
16131
< 0.1%
15951
< 0.1%

AQI_Bucket
Categorical

HIGH CORRELATION
MISSING

Distinct: 6
Distinct (%): < 0.1%
Missing: 21010
Missing (%): 19.4%
Memory size: 1.6 MiB

Moderate      29417
Satisfactory  23636
Very Poor     11762
Poor          11493
Good          5510

Length

Max length: 12
Median length: 8
Mean length: 8.32036771
Min length: 4
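
The mean length can be reproduced from the per-bucket counts listed under Common Values for this variable. A minimal sketch, assuming lengths are measured in characters including the space in "Very Poor" and that missing values are excluded; the dictionary name is illustrative:

```python
# Bucket counts as reported under Common Values (missing values excluded).
bucket_counts = {
    "Moderate": 29417,
    "Satisfactory": 23636,
    "Very Poor": 11762,
    "Poor": 11493,
    "Good": 5510,
    "Severe": 5207,
}

# Weighted character total and mean label length.
total_chars = sum(len(label) * n for label, n in bucket_counts.items())
total_values = sum(bucket_counts.values())
mean_length = total_chars / total_values

print(total_chars, total_values, round(mean_length, 8))
```

The character total matches the 724080 reported under Characters and Unicode, and the count of 87025 equals 108035 observations minus 21010 missing values.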

Characters and Unicode

Total characters: 724080
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Moderate
2nd row: Moderate
3rd row: Moderate
4th row: Moderate
5th row: Moderate

Common Values

Value         Count   Frequency (%)
Moderate      29417   27.2%
Satisfactory  23636   21.9%
Very Poor     11762   10.9%
Poor          11493   10.6%
Good          5510    5.1%
Severe        5207    4.8%
(Missing)     21010   19.4%

Length

Histogram of lengths of the category

Pie chart

Value         Count   Frequency (%)
moderate      29417   29.8%
satisfactory  23636   23.9%
poor          23255   23.5%
very          11762   11.9%
good          5510    5.6%
severe        5207    5.3%
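
The word counts here differ from the bucket counts because "Very Poor" contributes to both "very" and "poor". The sketch below reproduces the word table from the bucket counts, assuming words are obtained by lowercasing each label and splitting on whitespace; that tokenization is inferred from the numbers and is not taken from the report's source.

```python
from collections import Counter

# Bucket counts as reported under Common Values (missing values excluded).
bucket_counts = {
    "Moderate": 29417,
    "Satisfactory": 23636,
    "Very Poor": 11762,
    "Poor": 11493,
    "Good": 5510,
    "Severe": 5207,
}

# Split each label into lowercase words and weight by the label's count.
word_counts = Counter()
for label, n in bucket_counts.items():
    for word in label.lower().split():
        word_counts[word] += n

total_words = sum(word_counts.values())
poor_pct = 100 * word_counts["poor"] / total_words

print(word_counts["poor"], total_words, f"{poor_pct:.1f}%")
```

The "poor" total of 23255 is the sum of the "Very Poor" and "Poor" bucket counts, and the percentages are taken over all words rather than over rows.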

Most occurring characters

ValueCountFrequency (%)
o110583
15.3%
r93277
12.9%
e86217
11.9%
a76689
10.6%
t76689
10.6%
y35398
 
4.9%
d34927
 
4.8%
M29417
 
4.1%
S28843
 
4.0%
c23636
 
3.3%
Other values (8)128404
17.7%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  613531   84.7%
Uppercase Letter  98787    13.6%
Space Separator   11762    1.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o110583
18.0%
r93277
15.2%
e86217
14.1%
a76689
12.5%
t76689
12.5%
y35398
 
5.8%
d34927
 
5.7%
c23636
 
3.9%
s23636
 
3.9%
f23636
 
3.9%
Other values (2)28843
 
4.7%
Uppercase Letter
ValueCountFrequency (%)
M29417
29.8%
S28843
29.2%
P23255
23.5%
V11762
 
11.9%
G5510
 
5.6%
Space Separator
ValueCountFrequency (%)
11762
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin712318
98.4%
Common11762
 
1.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
o110583
15.5%
r93277
13.1%
e86217
12.1%
a76689
10.8%
t76689
10.8%
y35398
 
5.0%
d34927
 
4.9%
M29417
 
4.1%
S28843
 
4.0%
c23636
 
3.3%
Other values (7)116642
16.4%
Common
ValueCountFrequency (%)
11762
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII724080
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o110583
15.3%
r93277
12.9%
e86217
11.9%
a76689
10.6%
t76689
10.6%
y35398
 
4.9%
d34927
 
4.8%
M29417
 
4.1%
S28843
 
4.0%
c23636
 
3.3%
Other values (8)128404
17.7%

StationName
Categorical

HIGH CARDINALITY

Distinct: 110
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB

BWSSB Kadabesanahalli, Bengaluru - CPCB  2009
Central School, Lucknow - CPCB           2009
NSIT Dwarka, Delhi - CPCB                2009
Shadipur, Delhi - CPCB                   2009
CRRI Mathura Road, Delhi - IMD           2009
Other values (105)                       97990

Length

Max length: 53
Median length: 28
Mean length: 29.00140695
Min length: 17

Characters and Unicode

Total characters: 3133167
Distinct characters: 59
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Secretariat, Amaravati - APPCB
2nd row: Secretariat, Amaravati - APPCB
3rd row: Secretariat, Amaravati - APPCB
4th row: Secretariat, Amaravati - APPCB
5th row: Secretariat, Amaravati - APPCB

Common Values

Value | Count | Frequency (%)
BWSSB Kadabesanahalli, Bengaluru - CPCB | 2009 | 1.9%
Central School, Lucknow - CPCB | 2009 | 1.9%
NSIT Dwarka, Delhi - CPCB | 2009 | 1.9%
Shadipur, Delhi - CPCB | 2009 | 1.9%
CRRI Mathura Road, Delhi - IMD | 2009 | 1.9%
Lalbagh, Lucknow - CPCB | 2009 | 1.9%
Maninagar, Ahmedabad - GPCB | 2009 | 1.9%
Bandra, Mumbai - MPCB | 2009 | 1.9%
Alandur Bus Depot, Chennai - CPCB | 2009 | 1.9%
IHBAS, Dilshad Garden, Delhi - CPCB | 2009 | 1.9%
Other values (100) | 87945 | 81.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
108035
 
20.1%
delhi45360
 
8.5%
cpcb28911
 
5.4%
dpcc24404
 
4.6%
bengaluru11996
 
2.2%
imd8937
 
1.7%
hyderabad8752
 
1.6%
tspcb8752
 
1.6%
nagar7703
 
1.4%
kspcb7073
 
1.3%
Other values (220)276331
51.5%

Most occurring characters

ValueCountFrequency (%)
428219
 
13.7%
a302787
 
9.7%
C180834
 
5.8%
r148939
 
4.8%
i148765
 
4.7%
e133988
 
4.3%
P126317
 
4.0%
l120877
 
3.9%
B112958
 
3.6%
,111080
 
3.5%
Other values (49)1318403
42.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1631535
52.1%
Uppercase Letter841067
26.8%
Space Separator428219
 
13.7%
Other Punctuation114486
 
3.7%
Dash Punctuation110222
 
3.5%
Decimal Number4778
 
0.2%
Open Punctuation1430
 
< 0.1%
Close Punctuation1430
 
< 0.1%
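
The category counts above come from the Unicode general category of each character. As a rough sketch of how such a tally could be reproduced with Python's standard `unicodedata` module (the helper name `category_counts` and the name mapping are illustrative assumptions, not the report's own code):

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for unmapped categories.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# One station name taken from the Sample rows of this variable.
counts = category_counts(["Secretariat, Amaravati - APPCB"])
for name, count in counts.most_common():
    print(name, count)
```

Applied to the whole StationName column instead of a single value, the same tally would give the per-category totals listed in the table.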

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a302787
18.6%
r148939
9.1%
i148765
9.1%
e133988
 
8.2%
l120877
 
7.4%
h109044
 
6.7%
n108377
 
6.6%
u85726
 
5.3%
o67485
 
4.1%
t63428
 
3.9%
Other values (15)342119
21.0%
Uppercase Letter
ValueCountFrequency (%)
C180834
21.5%
P126317
15.0%
B112958
13.4%
D93467
11.1%
S63643
 
7.6%
M36849
 
4.4%
I28704
 
3.4%
T26309
 
3.1%
A26234
 
3.1%
R20517
 
2.4%
Other values (13)125235
14.9%
Decimal Number
ValueCountFrequency (%)
21580
33.1%
51161
24.3%
31036
21.7%
8882
18.5%
1119
 
2.5%
Other Punctuation
ValueCountFrequency (%)
,111080
97.0%
.3406
 
3.0%
Space Separator
ValueCountFrequency (%)
428219
100.0%
Dash Punctuation
ValueCountFrequency (%)
-110222
100.0%
Open Punctuation
ValueCountFrequency (%)
(1430
100.0%
Close Punctuation
ValueCountFrequency (%)
)1430
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2472602
78.9%
Common660565
 
21.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a302787
 
12.2%
C180834
 
7.3%
r148939
 
6.0%
i148765
 
6.0%
e133988
 
5.4%
P126317
 
5.1%
l120877
 
4.9%
B112958
 
4.6%
h109044
 
4.4%
n108377
 
4.4%
Other values (38)979716
39.6%
Common
ValueCountFrequency (%)
428219
64.8%
,111080
 
16.8%
-110222
 
16.7%
.3406
 
0.5%
21580
 
0.2%
(1430
 
0.2%
)1430
 
0.2%
51161
 
0.2%
31036
 
0.2%
8882
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII3133167
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
428219
 
13.7%
a302787
 
9.7%
C180834
 
5.8%
r148939
 
4.8%
i148765
 
4.7%
e133988
 
4.3%
P126317
 
4.0%
l120877
 
3.9%
B112958
 
3.6%
,111080
 
3.5%
Other values (49)1318403
42.1%

City
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 26
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB
Delhi: 45360
Bengaluru: 11996
Hyderabad: 8752
Chennai: 6406
Lucknow: 6099
Other values (21): 29422

Length

Max length: 18
Median length: 6
Mean length: 6.815059934
Min length: 5

Characters and Unicode

Total characters: 736265
Distinct characters: 38
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Amaravati
2nd row: Amaravati
3rd row: Amaravati
4th row: Amaravati
5th row: Amaravati

Common Values

Value | Count | Frequency (%)
Delhi | 45360 | 42.0%
Bengaluru | 11996 | 11.1%
Hyderabad | 8752 | 8.1%
Chennai | 6406 | 5.9%
Lucknow | 6099 | 5.6%
Mumbai | 5504 | 5.1%
Kolkata | 3165 | 2.9%
Jaipur | 3089 | 2.9%
Gurugram | 2831 | 2.6%
Patna | 2678 | 2.5%
Other values (16) | 12155 | 11.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
delhi45360
42.0%
bengaluru11996
 
11.1%
hyderabad8752
 
8.1%
chennai6406
 
5.9%
lucknow6099
 
5.6%
mumbai5504
 
5.1%
kolkata3165
 
2.9%
jaipur3089
 
2.9%
gurugram2831
 
2.6%
patna2678
 
2.5%
Other values (16)12155
 
11.3%

Most occurring characters

ValueCountFrequency (%)
a87703
11.9%
e75834
10.3%
i67022
 
9.1%
l62630
 
8.5%
h61706
 
8.4%
u47514
 
6.5%
D45360
 
6.2%
r42325
 
5.7%
n39265
 
5.3%
d21826
 
3.0%
Other values (28)185080
25.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter628230
85.3%
Uppercase Letter108035
 
14.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a87703
14.0%
e75834
12.1%
i67022
10.7%
l62630
10.0%
h61706
9.8%
u47514
7.6%
r42325
6.7%
n39265
 
6.3%
d21826
 
3.5%
b16651
 
2.7%
Other values (13)105754
16.8%
Uppercase Letter
ValueCountFrequency (%)
D45360
42.0%
B13223
 
12.2%
H8752
 
8.1%
C7096
 
6.6%
L6099
 
5.6%
M5504
 
5.1%
A4294
 
4.0%
J4258
 
3.9%
G3333
 
3.1%
K3327
 
3.1%
Other values (5)6789
 
6.3%

Most occurring scripts

ValueCountFrequency (%)
Latin736265
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a87703
11.9%
e75834
10.3%
i67022
 
9.1%
l62630
 
8.5%
h61706
 
8.4%
u47514
 
6.5%
D45360
 
6.2%
r42325
 
5.7%
n39265
 
5.3%
d21826
 
3.0%
Other values (28)185080
25.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII736265
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a87703
11.9%
e75834
10.3%
i67022
 
9.1%
l62630
 
8.5%
h61706
 
8.4%
u47514
 
6.5%
D45360
 
6.2%
r42325
 
5.7%
n39265
 
5.3%
d21826
 
3.0%
Other values (28)185080
25.1%

State
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 21
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 MiB
Delhi: 45360
Karnataka: 11996
Telangana: 8752
Tamil Nadu: 6792
Uttar Pradesh: 6099
Other values (16): 29036

Length

Max length: 14
Median length: 7
Mean length: 7.558744851
Min length: 5

Characters and Unicode

Total characters: 816609
Distinct characters: 36
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Andhra Pradesh
2nd row: Andhra Pradesh
3rd row: Andhra Pradesh
4th row: Andhra Pradesh
5th row: Andhra Pradesh

Common Values

Value | Count | Frequency (%)
Delhi | 45360 | 42.0%
Karnataka | 11996 | 11.1%
Telangana | 8752 | 8.1%
Tamil Nadu | 6792 | 6.3%
Uttar Pradesh | 6099 | 5.6%
Maharashtra | 5504 | 5.1%
West Bengal | 3165 | 2.9%
Rajasthan | 3089 | 2.9%
Haryana | 2831 | 2.6%
Bihar | 2678 | 2.5%
Other values (11) | 11769 | 10.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
delhi45360
35.8%
karnataka11996
 
9.5%
pradesh8801
 
6.9%
telangana8752
 
6.9%
tamil6792
 
5.4%
nadu6792
 
5.4%
uttar6099
 
4.8%
maharashtra5504
 
4.3%
west3165
 
2.5%
bengal3165
 
2.5%
Other values (14)20367
16.1%

Most occurring characters

ValueCountFrequency (%)
a166079
20.3%
h78757
9.6%
e71129
8.7%
l65955
 
8.1%
i57110
 
7.0%
r50997
 
6.2%
D45360
 
5.6%
n43692
 
5.4%
t37961
 
4.6%
s23426
 
2.9%
Other values (26)176143
21.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter671058
82.2%
Uppercase Letter126793
 
15.5%
Space Separator18758
 
2.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a166079
24.7%
h78757
11.7%
e71129
10.6%
l65955
 
9.8%
i57110
 
8.5%
r50997
 
7.6%
n43692
 
6.5%
t37961
 
5.7%
s23426
 
3.5%
d21631
 
3.2%
Other values (9)54321
 
8.1%
Uppercase Letter
ValueCountFrequency (%)
D45360
35.8%
T15544
 
12.3%
K13572
 
10.7%
P10022
 
7.9%
N6792
 
5.4%
M6216
 
4.9%
U6099
 
4.8%
B5843
 
4.6%
W3165
 
2.5%
R3089
 
2.4%
Other values (6)11091
 
8.7%
Space Separator
ValueCountFrequency (%)
18758
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin797851
97.7%
Common18758
 
2.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a166079
20.8%
h78757
9.9%
e71129
8.9%
l65955
 
8.3%
i57110
 
7.2%
r50997
 
6.4%
D45360
 
5.7%
n43692
 
5.5%
t37961
 
4.8%
s23426
 
2.9%
Other values (25)157385
19.7%
Common
ValueCountFrequency (%)
18758
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII816609
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a166079
20.3%
h78757
9.6%
e71129
8.7%
l65955
 
8.1%
i57110
 
7.0%
r50997
 
6.2%
D45360
 
5.6%
n43692
 
5.4%
t37961
 
4.6%
s23426
 
2.9%
Other values (26)176143
21.6%

Status
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 324
Missing (%): 0.3%
Memory size: 1.6 MiB
Active: 106102
Inactive: 1609

Length

Max length: 8
Median length: 6
Mean length: 6.029876243
Min length: 6

Characters and Unicode

Total characters: 649484
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Active
2nd row: Active
3rd row: Active
4th row: Active
5th row: Active

Common Values

Value | Count | Frequency (%)
Active | 106102 | 98.2%
Inactive | 1609 | 1.5%
(Missing) | 324 | 0.3%

Length

Histogram of lengths of the category

ValueCountFrequency (%)
active106102
98.5%
inactive1609
 
1.5%

Most occurring characters

ValueCountFrequency (%)
c107711
16.6%
t107711
16.6%
i107711
16.6%
v107711
16.6%
e107711
16.6%
A106102
16.3%
I1609
 
0.2%
n1609
 
0.2%
a1609
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter541773
83.4%
Uppercase Letter107711
 
16.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c107711
19.9%
t107711
19.9%
i107711
19.9%
v107711
19.9%
e107711
19.9%
n1609
 
0.3%
a1609
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
A106102
98.5%
I1609
 
1.5%

Most occurring scripts

ValueCountFrequency (%)
Latin649484
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
c107711
16.6%
t107711
16.6%
i107711
16.6%
v107711
16.6%
e107711
16.6%
A106102
16.3%
I1609
 
0.2%
n1609
 
0.2%
a1609
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII649484
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c107711
16.6%
t107711
16.6%
i107711
16.6%
v107711
16.6%
e107711
16.6%
A106102
16.3%
I1609
 
0.2%
n1609
 
0.2%
a1609
 
0.2%

Interactions

[Figure: 13 × 13 grid of pairwise interaction plots for the numeric variables]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
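
As a minimal sketch of that calculation, the snippet below ranks each variable (tied values share the mean of their ranks) and then applies Pearson's formula to the ranks. The function names `average_ranks`, `pearson`, and `spearman` are illustrative assumptions and not the report generator's own code. The inputs are the PM2.5 and PM10 values from the first five rows of the Sample section.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance over the product of standard deviations (1/n factors cancel)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


pm25 = [71.36, 81.40, 78.32, 88.76, 64.18]
pm10 = [115.75, 124.50, 129.06, 135.32, 104.09]
rho = spearman(pm25, pm10)
print(round(rho, 3))  # 0.9

# A monotonic but nonlinear relation still yields rho = 1.
print(round(spearman([1, 2, 3, 4], [1, 8, 27, 64]), 3))  # 1.0
```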

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
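
The following sketch works through that formula and checks the invariance property on made-up numbers. The function name `pearson_r` and the sample values are assumptions for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Illustrative values, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)

# Shifting and rescaling either variable (positive scale) leaves r unchanged.
r_shifted = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

# An exact linear relation with positive slope gives r = 1 whatever the slope.
r_steep = pearson_r(x, [100 * a for a in x])
r_shallow = pearson_r(x, [0.01 * a for a in x])

print(abs(r - r_shifted) < 1e-9)
```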

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
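
The pair-counting definition can be sketched directly. This is the simple form without tie correction, often called tau-a; library routines such as SciPy's `kendalltau` default to the tie-adjusted tau-b variant, so results can differ when ties are present. The function name `kendall_tau_a` is an illustrative assumption, and the inputs are again the PM2.5 and PM10 values from the first five Sample rows.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


pm25 = [71.36, 81.40, 78.32, 88.76, 64.18]
pm10 = [115.75, 124.50, 129.06, 135.32, 104.09]
tau = kendall_tau_a(pm25, pm10)
print(tau)  # 9 concordant and 1 discordant pair out of 10 gives 0.8
```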

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
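
A sketch of the bias-corrected estimator, written from the published formula rather than from the report generator's source, might look as follows. The function name `cramers_v_corrected` and the toy city/state lists are assumptions for illustration.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V (Bergsma, 2013) for two nominal sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full r x k contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint[(a, b)] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected bias and shrink the table dimensions accordingly.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Toy data: each city maps to exactly one state (perfect association).
city = ["Delhi", "Delhi", "Bengaluru", "Bengaluru", "Hyderabad", "Hyderabad"] * 5
state = ["Delhi", "Delhi", "Karnataka", "Karnataka", "Telangana", "Telangana"] * 5
v_perfect = cramers_v_corrected(city, state)

# Toy data: the two variables are independent by construction.
v_independent = cramers_v_corrected(["a", "a", "b", "b"] * 5, ["u", "v", "u", "v"] * 5)
print(round(v_perfect, 6), v_independent)
```

A deterministic mapping such as City to State is the kind of relationship that would trigger the HIGH CORRELATION flags on those two variables.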

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
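
The nullity correlation shown in the heatmap can be thought of as Pearson's r between the missing-value indicators of two columns. Below is a rough sketch under that assumption; the helper names `is_missing` and `nullity_correlation` are hypothetical, and the values are the NO, NOx and Xylene entries for rows 108029 to 108034 of the "Last rows" sample.

```python
from math import sqrt


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and value != value)


def nullity_correlation(col_a, col_b):
    """Pearson correlation of the two columns' missing-value indicators.

    Returns None when either column is fully present or fully missing,
    because a constant indicator has zero variance.
    """
    a = [1.0 if is_missing(v) else 0.0 for v in col_a]
    b = [1.0 if is_missing(v) else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((v - mean_a) ** 2 for v in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return None
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    return cov / sqrt(var_a * var_b)


nan = float("nan")
no = [23.51, nan, nan, 13.65, 7.56, 7.78]
nox = [40.02, nan, nan, 214.20, 36.69, 30.25]
xylene = [nan] * 6

same_pattern = nullity_correlation(no, nox)   # missing in the same rows
undefined = nullity_correlation(no, xylene)   # Xylene is always missing here
print(same_pattern, undefined)
```

A value near +1 means the two columns tend to be missing together, and a value near -1 means one tends to be present exactly when the other is absent.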

Sample

First rows

Index | StationId | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | AQI_Bucket | StationName | City | State | Status
0 | AP001 | 2017-11-24 | 71.36 | 115.75 | 1.75 | 20.65 | 12.40 | 12.19 | 0.10 | 10.76 | 109.26 | 0.17 | 5.92 | 0.10 | NaN | NaN | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | Active
1 | AP001 | 2017-11-25 | 81.40 | 124.50 | 1.44 | 20.50 | 12.08 | 10.72 | 0.12 | 15.24 | 127.09 | 0.20 | 6.50 | 0.06 | 184.0 | Moderate | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | Active
2 | AP001 | 2017-11-26 | 78.32 | 129.06 | 1.26 | 26.00 | 14.85 | 10.28 | 0.14 | 26.96 | 117.44 | 0.22 | 7.95 | 0.08 | 197.0 | Moderate | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | Active
3 | AP001 | 2017-11-27 | 88.76 | 135.32 | 6.60 | 30.85 | 21.77 | 12.91 | 0.11 | 33.59 | 111.81 | 0.29 | 7.63 | 0.12 | 198.0 | Moderate | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | Active
4 | AP001 | 2017-11-28 | 64.18 | 104.09 | 2.56 | 28.07 | 17.01 | 11.42 | 0.09 | 19.00 | 138.18 | 0.17 | 5.02 | 0.07 | 188.0 | Moderate | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | Active
5 | AP001 | 2017-11-29 | 72.47 | 114.84 | 5.23 | 23.20 | 16.59 | 12.25 | 0.16 | 10.55 | 109.74 | 0.21 | 4.71 | 0.08 | 173.0 | Moderate | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | Active
6 | AP001 | 2017-11-30 | 69.80 | 114.86 | 4.69 | 20.17 | 14.54 | 10.95 | 0.12 | 14.07 | 118.09 | 0.16 | 3.52 | 0.06 | 165.0 | Moderate | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | Active
7 | AP001 | 2017-12-01 | 73.96 | 113.56 | 4.58 | 19.29 | 13.97 | 10.95 | 0.10 | 13.90 | 123.80 | 0.17 | 2.85 | 0.04 | 191.0 | Moderate | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | Active
8 | AP001 | 2017-12-02 | 89.90 | 140.20 | 7.71 | 26.19 | 19.87 | 13.12 | 0.10 | 19.37 | 128.73 | 0.25 | 2.79 | 0.07 | 191.0 | Moderate | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | Active
9 | AP001 | 2017-12-03 | 87.14 | 130.52 | 0.97 | 21.31 | 12.12 | 14.36 | 0.15 | 11.41 | 114.80 | 0.23 | 3.82 | 0.04 | 227.0 | Poor | Secretariat, Amaravati - APPCB | Amaravati | Andhra Pradesh | Active

Last rows

Index | StationId | Date | PM2.5 | PM10 | NO | NO2 | NOx | NH3 | CO | SO2 | O3 | Benzene | Toluene | Xylene | AQI | AQI_Bucket | StationName | City | State | Status
108025 | WB013 | 2020-06-22 | 15.10 | 30.98 | 2.59 | 18.04 | 20.63 | 30.34 | 0.67 | 1.50 | 25.84 | 1.28 | 8.32 | NaN | 38.0 | Good | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | Active
108026 | WB013 | 2020-06-23 | 19.48 | 42.37 | 3.06 | 20.94 | 23.99 | 32.53 | 0.70 | 1.72 | 28.21 | 1.65 | 5.93 | NaN | 44.0 | Good | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | Active
108027 | WB013 | 2020-06-24 | 20.05 | 46.63 | 3.78 | 15.28 | 19.06 | 18.89 | 0.66 | 7.08 | 41.56 | 1.14 | 7.24 | NaN | 59.0 | Satisfactory | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | Active
108028 | WB013 | 2020-06-25 | 17.03 | 39.64 | 3.23 | 11.42 | 14.65 | 18.98 | 0.57 | 11.39 | 31.76 | 0.79 | 6.85 | NaN | 56.0 | Satisfactory | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | Active
108029 | WB013 | 2020-06-26 | 9.79 | 19.87 | 23.51 | 16.50 | 40.02 | 25.09 | 0.66 | 10.34 | 30.19 | 0.93 | 6.37 | NaN | 50.0 | Good | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | Active
108030 | WB013 | 2020-06-27 | 8.65 | 16.46 | NaN | NaN | NaN | NaN | 0.69 | 4.36 | 30.59 | 1.32 | 7.26 | NaN | 50.0 | Good | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | Active
108031 | WB013 | 2020-06-28 | 11.80 | 18.47 | NaN | NaN | NaN | NaN | 0.68 | 3.49 | 38.95 | 1.42 | 7.92 | NaN | 65.0 | Satisfactory | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | Active
108032 | WB013 | 2020-06-29 | 18.60 | 32.26 | 13.65 | 200.87 | 214.20 | 11.40 | 0.78 | 5.12 | 38.17 | 3.52 | 8.64 | NaN | 63.0 | Satisfactory | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | Active
108033 | WB013 | 2020-06-30 | 16.07 | 39.30 | 7.56 | 29.13 | 36.69 | 29.26 | 0.69 | 5.88 | 29.64 | 1.86 | 8.40 | NaN | 57.0 | Satisfactory | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | Active
108034 | WB013 | 2020-07-01 | 10.50 | 36.50 | 7.78 | 22.50 | 30.25 | 27.23 | 0.58 | 2.80 | 13.10 | 1.31 | 7.39 | NaN | 59.0 | Satisfactory | Victoria, Kolkata - WBPCB | Kolkata | West Bengal | Active